Hard tier gates: coverage dimension, no heuristic escapes #61

Merged
luisleo526 merged 1 commit into main from tier/hard-gates-coverage
Jul 5, 2026

@luisleo526

Owner-approved tier-ladder redesign (validation campaign): every tolerance must be principled and visible; a failing gate must name a distinct bug class. TV-side impossibilities keep their explicit channel (per-class profiles, documented anomaly overrides) — hidden rescues are gone.

What changed

| Dimension | Before | After |
| --- | --- | --- |
| Coverage | not gated; all metrics computed inside the self-selected match window | first-class gate vs ALL closed TV trades: excellent ≥99% (or ≤1 unmatched), strong ≥95%, moderate ≥75% |
| PnL | threshold OR 3 circular heuristic escapes | threshold only; per-trade forgiveness only when arithmetically explained by that trade's own in-tolerance exit drift (` |
| Exit | threshold OR pnl_validated_exit_noise | threshold only |
| Qty-normalized PnL rescue | unbounded (any sizing bug rescaled away) | bounded to ±2% sizing drift |
| MAE | "gated" at 5% but waived whenever excellent was otherwise reachable (dead gate) | report-only like MFE (intrabar-path-limited); still printed as diagnostic |

Kept (principled, visible): cent-rounding ε, near-zero-PnL exclusion, strict/production per-class profiles, fragment consolidation + FIFO schedule scoring, expected_tier anomaly overrides. Output adds a Coverage: line; every previously-scraped line is byte-stable (downstream regex consumers unaffected).
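
For reviewers who want the two money rules in executable form, here is a minimal sketch. It is not the code in this PR. The forgiveness inequality is the one given in the commit message (`|dPnL| <= qty*|dExit| + cent-rounding epsilon`), and the ±2% bound is from the table above. Everything else is an assumption made for illustration: the `TradePair` shape, both function names, an epsilon of 0.01, an exit tolerance expressed in absolute price units, and the exact way the rescue rescales PnL.

```python
from dataclasses import dataclass
from typing import Optional

CENT_EPSILON = 0.01      # assumed value for the cent-rounding epsilon
MAX_SIZING_DRIFT = 0.02  # the +/-2% sizing-drift bound stated in this PR


@dataclass(frozen=True)
class TradePair:
    """One matched TV/local trade pair (hypothetical shape, not the repo's)."""
    tv_qty: float
    local_qty: float
    tv_exit: float
    local_exit: float
    tv_pnl: float
    local_pnl: float


def pnl_miss_explained_by_exit(pair: TradePair, exit_tol: float) -> bool:
    """True if the PnL miss is accounted for by the trade's own exit drift.

    exit_tol is assumed to be an absolute price tolerance.
    """
    d_exit = abs(pair.local_exit - pair.tv_exit)
    if d_exit > exit_tol:
        # Out-of-tolerance exit drift cannot excuse a PnL miss.
        return False
    d_pnl = abs(pair.local_pnl - pair.tv_pnl)
    return d_pnl <= pair.tv_qty * d_exit + CENT_EPSILON


def qty_normalized_pnl(pair: TradePair) -> Optional[float]:
    """Rescale local PnL to TV's quantity, only inside the sizing-drift bound.

    Returns None when no rescue applies, so the raw PnL goes to the PnL gate.
    """
    if pair.tv_qty == 0 or pair.local_qty == 0:
        return None
    drift = abs(pair.local_qty / pair.tv_qty - 1.0)
    if drift > MAX_SIZING_DRIFT:
        return None
    return pair.local_pnl * pair.tv_qty / pair.local_qty
```

With `tv_qty=10`, a 0.05 exit drift, and a 0.50 PnL miss, the first check passes under a 0.10 exit tolerance because 10 × 0.05 plus the epsilon covers the miss. A 2.00 miss on the same trade stays unexplained and would fail the PnL gate.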

Impact on committed corpus artifacts (same inputs, rubric-only diff)

excellent 229→226, strong 21→23, moderate 1→2, anomaly 1→1

The two new moderates are real reproduction gaps the old window-trim concealed — prices/PnL exact on everything matched, but whole trades missing:

  • composite-trendmaster-three-tier-ema-state-01 — 209/224 TV trades reproduced (Coverage 93.3% X; unmatched=15)
  • vwap-bands-mean-reversion-2sigma-01 — 228/241 (Coverage 94.6% X; unmatched=13)

Fairness correction: composite-4emarsi-quad-ema-stack-01 moderate→strong (98.7% coverage, exact prices).
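
The coverage numbers above can be checked by hand with a small sketch of the gate. The thresholds are the ones stated in this PR. The function name, the return shape, and the `"weak"` label for anything under 75% are assumptions for illustration. The function only gives the tier that coverage alone allows; the other gates can still lower the final grade.

```python
from typing import Tuple


def coverage_tier(matched: int, total_tv_closed: int) -> Tuple[float, str]:
    """Return (coverage fraction, best tier that coverage alone permits).

    Coverage is measured against ALL closed TV trades, not the match window.
    """
    if total_tv_closed <= 0:
        raise ValueError("need at least one closed TV trade")
    if not 0 <= matched <= total_tv_closed:
        raise ValueError("matched must be between 0 and total_tv_closed")

    unmatched = total_tv_closed - matched
    coverage = matched / total_tv_closed

    if coverage >= 0.99 or unmatched <= 1:
        tier = "excellent"
    elif coverage >= 0.95:
        tier = "strong"
    elif coverage >= 0.75:
        tier = "moderate"
    else:
        tier = "weak"  # assumed label for everything below the moderate floor
    return coverage, tier


for matched, total in [(209, 224), (228, 241)]:
    cov, tier = coverage_tier(matched, total)
    print(f"{matched}/{total}: {cov * 100:.1f}% -> {tier}")
```

Both corpus cases fall just under the 95% line, which is why they are capped at moderate even though every matched trade is exact. The `unmatched <= 1` clause is taken literally here, so a strategy with very few trades passes the excellent threshold with a single miss.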

Downstream scraper corpus (diagnostic preview): confirmed false excellents at 0.9%/1.0% coverage now grade weak with self-explanatory output (Coverage 1.0% X; unmatched=103 of 104 TV).
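
Downstream consumers that want the new line could read it with something like the sketch below. The pattern is inferred only from the two sample strings quoted in this description (`Coverage 93.3% X; unmatched=15` and `Coverage 1.0% X; unmatched=103 of 104 TV`), so the real output format may differ. `COVERAGE_RE` and `parse_coverage` are hypothetical names, and the optional colon and optional `of N TV` suffix are guesses that accommodate both quoted forms.

```python
import re
from typing import Optional

# Inferred from the sample lines in this PR description; not a documented format.
COVERAGE_RE = re.compile(
    r"Coverage:?\s+(?P<pct>\d+(?:\.\d+)?)%"   # percentage, colon optional
    r".*?unmatched=(?P<unmatched>\d+)"        # skip the pass/fail marker
    r"(?:\s+of\s+(?P<total>\d+)\s+TV)?"       # total is present only sometimes
)


def parse_coverage(line: str) -> Optional[dict]:
    """Extract coverage fields from one output line, or None if absent."""
    m = COVERAGE_RE.search(line)
    if m is None:
        return None
    total = m.group("total")
    return {
        "pct": float(m.group("pct")),
        "unmatched": int(m.group("unmatched")),
        "total": int(total) if total is not None else None,
    }
```

Because the Coverage line is additive and the existing lines are byte-stable, a parser like this can sit next to the existing regex consumers without changing them.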

Gates run: py_compile + full doctest suite pass; before/after scoring sweep over 252 corpus + ~650 scraper dirs with per-row diff review.

🤖 Generated with Claude Code

Redesign of the tier ladder so every tolerance is principled and
visible, and a failing gate names a distinct bug class:

- Coverage becomes a first-class gate: matched fraction of ALL closed
  TV trades (interior-trimmed when declared), not the self-selected
  common match window. excellent >= 99% (or <=1 unmatched), strong
  >= 95%, moderate >= 75%. Previously a run reproducing 1% of TV's
  trades could grade excellent on the sliver it matched.
- Removed the heuristic escape hatches (tiny_exit_pnl_noise,
  strong_exit_pnl_coupling, pnl_validated_exit_noise, mae_intrabar/
  trail waivers). Replaced by one mechanistic rule: a per-trade PnL
  miss is forgiven only when arithmetically explained by that trade's
  own exit drift within the profile's exit tolerance
  (|dPnL| <= qty*|dExit| + cent-rounding epsilon). Exit gate owns exit
  fills; PnL gate now fails only on unexplained money drift.
- Qty-normalized PnL rescue bounded to +/-2% sizing drift; larger
  sizing divergence surfaces in the PnL gate instead of rescaling away.
- MAE joins MFE as report-only: excursions depend on the intrabar
  path, which TV resolves from finer data than the local OHLC bars;
  both remain printed as diagnostics.
- Declared tolerances kept: cent-rounding epsilon, near-zero PnL
  exclusion, per-class strict/production profiles, fragment
  consolidation + FIFO schedule scoring, documented anomaly overrides.
- Docstring canon fixed: this file IS the single source of truth
  (pineforge-utils tracks it, not vice versa); stale corpus counts
  removed. Output adds a Coverage line; all previously-scraped lines
  are byte-stable.

Corpus (same committed artifacts): excellent 229->226, strong 21->23,
moderate 1->2. The two new moderates are real reproduction gaps that
the old window trim concealed:
composite-trendmaster-three-tier-ema-state-01 (209/224 TV trades
reproduced, prices exact) and vwap-bands-mean-reversion-2sigma-01
(228/241). Both now show 'Coverage X' with exact unmatched counts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
luisleo526 merged commit 0780c23 into main on Jul 5, 2026
5 checks passed
luisleo526 deleted the tier/hard-gates-coverage branch July 5, 2026 07:30